By the end of this hands-on tutorial you should be familiar with:
The minimum knowledge to benefit from the tutorial includes being familiar with:
You are leading a research project involving a systematic review of multiple interventions for a specific health condition and patient population. You have put a lot of effort into establishing a rigorous protocol, registered it with the PROSPERO register, and completed the data extraction.
While reviewing the extracted data, you notice that many eligible trials report having missing participants. How are you going to proceed with the analysis?
Consider the small fictional trial in Figure 1, where 20 participants have been randomised to two interventions and the outcome is binary (symptoms improved or worsened after the intervention). Some participants completed the trial (completers), while others left the trial prematurely for various reasons (missing). The goal is to draw conclusions using the entire randomised sample.
Figure 1: A fictional small trial of two interventions
Without individual participant data, there are only a few methods to handle missing participants in (network) meta-analysis:
Both exclusion and imputation are suboptimal methods for handling missing participants. Exclusion, as shown in Figure 2, is not appropriate because it reduces the sample to those participants who remained in the trial and decreases the power to detect significant associations. Moreover, if the missingness mechanism is not M(C)AR, the risk of estimating a biased intervention effect (e.g., odds ratio) is imminent.
Figure 2: Excluding missing participants from both interventions
Figure 3 illustrates the imputation of missing participants in two interventions. In the first intervention, it is assumed that all missing participants experienced symptom improvement and thus left the trial prematurely. In the second intervention, it is assumed that all missing participants experienced symptom deterioration and left the trial prematurely. Imputation amounts to adding the number of missing participants to the outcome dictated by the corresponding scenario: to the number of events (improvement) in the first intervention and to the number of non-events (deterioration) in the second intervention. This method preserves the randomised sample but fixes the missingness scenario. Without following up the missing participants to record their outcomes, we cannot know the true missingness mechanism. Hence, any assumptions we make about the missingness mechanism should naturally propagate into the estimation of the intervention effect by increasing its standard error. Imputation rears its ugly head by ‘treating’ the missing participants as observed, increasing the risk of estimating a biased intervention effect and drawing erroneous conclusions about statistical significance.
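To make the contrast concrete, here is a small base-R sketch comparing the odds ratio under exclusion and under the two imputation scenarios just described. The counts are hypothetical and chosen only for illustration; they are not the actual numbers in Figure 1.

```r
# Hypothetical two-arm trial of 20 randomised participants:
# arm 1: 8 completers (6 with improvement), 2 missing
# arm 2: 7 completers (3 with improvement), 3 missing
r1 <- 6; c1 <- 8; m1 <- 2
r2 <- 3; c2 <- 7; m2 <- 3

# Exclusion: analyse the completers only
or_exclusion <- (r1 / (c1 - r1)) / (r2 / (c2 - r2))

# Imputation under the scenarios above: missing participants counted as
# events (improvement) in arm 1 and as non-events (deterioration) in arm 2
or_imputation <- ((r1 + m1) / (c1 - r1)) / (r2 / (c2 - r2 + m2))

round(c(exclusion = or_exclusion, imputation = or_imputation), 2)
```

Note how the two approaches can yield markedly different odds ratios from the same trial, even though neither acknowledges any extra uncertainty about the missing participants.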
Figure 3: Imputing missing participants making different scenarios in each intervention
(Network) meta-analysis inherits the trial-level limitations of exclusion and imputation via the inclusion of trials with missing participants. As you do not have individual participant data to make use of more sophisticated methods for missing participants, the quality of the (network) meta-analysis results will strongly depend, among other things, on your good guess about the missingness mechanism and how you incorporate this good guess into the results. We have good news on that; just keep reading.
The last member of the squad of methods for missing participants in (network) meta-analysis is the pattern-mixture model. The pattern-mixture model elegantly adjusts, rather than fixes, the average outcome (in this case, the risk of an event) to the assumed missingness mechanism, and propagates this assumption into the standard error of the estimated intervention effect, while maintaining the randomised sample. It allows for different assumptions about the missingness mechanism between interventions or across trials, offering a flexible framework for properly handling missing participants in (network) meta-analysis. If exclusion, imputation and the pattern-mixture model were characters from the epic spaghetti Western, The Good, the Bad and the Ugly, it is rather clear who would be who.
Figure 4 shows two normal distributions of
the informative missingness odds ratio (IMOR), which is
a central parameter in the pattern-mixture model for a binary outcome.
IMOR is similar to the odds ratio for two interventions, as it compares
the odds of an event in missing participants with the odds of an event
in completers. An IMOR equal to 1 indicates the MAR mechanism (i.e., no
difference in the compared groups), and an IMOR different from 1 implies
the MNAR mechanism (i.e., the compared groups differ). To illustrate
this, we have considered a different IMOR distribution for each
intervention. For the first intervention, we assumed that, on
average, the odds of an event
in completers were twice that in missing participants (the mode), with
other scenarios being less likely to occur as we deviate from the mode.
For the second intervention, we assumed that, on average, the odds of an
event in missing participants were twice that in completers (the mode),
again, with other scenarios being less likely to occur as we deviate
from the mode. It is important to note that the normal distributions
presented in Figure 4 apply to the IMOR
on the logarithmic scale, which is commonly used when estimating the
odds ratio.
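The two log-IMOR distributions can be sketched in base R. The modes log(1/2) and log(2) follow from the scenarios above, while the standard deviation of 1 is an illustrative assumption (it matches the var_misspar = 1 choice used later in this tutorial):

```r
# Log-IMOR distributions for the two scenarios (sd = 1 is an assumption)
log_imor <- seq(-3, 3, length.out = 200)

# Arm 1: odds of an event in completers assumed twice that in missing
# participants, so the mode of the IMOR (missing vs completers) is 1/2
dens_arm1 <- dnorm(log_imor, mean = log(1/2), sd = 1)

# Arm 2: odds of an event in missing participants assumed twice that in
# completers, so the mode of the IMOR is 2
dens_arm2 <- dnorm(log_imor, mean = log(2), sd = 1)

matplot(log_imor, cbind(dens_arm1, dens_arm2), type = "l", lty = 1,
        xlab = "log IMOR", ylab = "Density")
abline(v = 0, lty = 2)  # log IMOR = 0 corresponds to MAR (IMOR = 1)
```

Scenarios close to the mode receive the most weight, and the dashed line at zero marks the MAR mechanism that both distributions treat as plausible but not certain.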
Figure 4: The IMOR parameter under different scenarios for each intervention
The rnmamod R package was originally developed to address the lack of dedicated functions for handling missing participants in pairwise and network meta-analysis in existing R packages. Over time, our goal expanded to create a comprehensive suite of functions for performing and visualising Bayesian pairwise and network meta-analysis. The package implements the pattern-mixture model for proper handling of missing participants in all models. In this tutorial, we will walk you through the core functions of the package and provide a comprehensive visual summary of the results.
You may install and load the package directly from CRAN by running the code below:
install.packages("rnmamod")
library(rnmamod)
However, we recommend installing the development version to experience the latest advances in the package:
install.packages("devtools")
devtools::install_github("LoukiaSpin/rnmamod")
We will use the dataset of Baker et
al. (2009), which includes 21 trials comparing seven pharmacological
interventions and placebo in chronic obstructive pulmonary disease
(COPD) patients. The aim is to determine if patients experienced COPD
exacerbation after receiving the randomised intervention. Run
head(nma.baker2009) to see the first six trials, or
nma.baker2009 to see the entire dataset. The dataset has
one-trial-per-row format, which is the typical format encountered in
BUGS language models.
head(nma.baker2009)
#> study t1 t2 t3 t4 r1 r2 r3 r4 m1 m2 m3 m4 n1 n2 n3 n4
#> Llewellyn-Jones, 1996 1 4 NA NA 3 0 NA NA 1 0 NA NA 8 8 NA NA
#> Paggiaro, 1998 1 4 NA NA 51 45 NA NA 27 19 NA NA 139 142 NA NA
#> Mahler, 1999 1 7 NA NA 47 28 NA NA 23 9 NA NA 143 135 NA NA
#> Casaburi, 2000 1 8 NA NA 41 45 NA NA 18 12 NA NA 191 279 NA NA
#> van Noord, 2000 1 7 NA NA 18 11 NA NA 8 7 NA NA 50 47 NA NA
#> Rennard, 2001 1 7 NA NA 41 38 NA NA 29 22 NA NA 135 132 NA NA
To create a network plot use the netplot function (type
?netplot to learn more):
# The names of the interventions are in the order they appear in the dataset
interv_names <- c("placebo", "budesonide", "budesonide plus formoterol", "fluticasone", "fluticasone plus salmeterol", "formoterol", "salmeterol", "tiotropium")
# Create the network plot and tables
netplot(data = nma.baker2009,
drug_names = interv_names,
text.cex = 1.5)
The function currently calls the
nma.networkplot function
from the pcnetmeta R
package. The netplot function gives further insights into
the network evidence by printing the following three tables on the
console:
* A summary of the trials, randomised sample, and outcome data
for each observed comparison in the network
The nma.baker2009 dataset was selected due to its high
number of missing participants in most trials, interventions and
comparisons. Use the heatmap_missing_dataset function to
view the distribution of missing participants (in percent) in the
dataset. Most trials have interventions with high levels of missingness,
with only a few trials having zero or low missingness rates.
# Abbreviated interventions (in the order they appear in the datasets)
abbr_interv_names <- c("placebo", "BUDE", "BUDE plus FORM", "FLUT", "FLUT plus SALM", "FORM", "SALM", "TIOT")
# Create the heatmap of % missing participants for each trial and arm
heatmap_missing_dataset(data = nma.baker2009,
trial_names = nma.baker2009$study,
drug_names = abbr_interv_names)
Use the heatmap_missing_network function to view the
median and range of missing participants (in percent) for each
intervention (on the main diagonal) and comparison (lower
off-diagonal):
# Abbreviated interventions (in the order they appear in the dataset)
abbr_interv_names <- c("placebo", "BUDE", "BUDE plus FORM", "FLUT", "FLUT plus SALM", "FORM", "SALM", "TIOT")
# Create the heatmap of % missing participants for each intervention and observed comparison
heatmap_missing_network(data = nma.baker2009,
drug_names = abbr_interv_names)
We can immediately spot the comparison with the lowest median of missing
participants: tiotropium versus formoterol.
This section introduces you to the backbone function of the
rnmamod architecture: the run_model. This function conducts
Bayesian network meta-analysis while accounting for
missing participants using the pattern-mixture model. Most functions in
the package cannot work without the run_model function and, on its own,
it returns results for a wealth of model parameters. The magic
happens once the run_model function is fed into the other
functions of the package, which enable you to run all necessary analyses
and obtain various graphs to understand and interpret the results.
Let’s run a Bayesian random-effects (model = "RE")
network meta-analysis while addressing missing participants reported in
each intervention arm of every trial. The effect measure of interest is
the odds ratio (measure = "OR"). A half-normal prior
distribution is used for the between-trial standard deviation with scale
parameter equal to 1
(heter_prior = list("halfnormal", 0, 1)) and one IMOR
parameter is estimated for each intervention in the network
(assumption = "IDE-ARM"). The model operates under the
MAR assumption on average (mean_misspar = c(0, 0))
with a variance equal to 1 (var_misspar = 1), and the
outcome is harmful (D = 0). Assign the output of the
function to an object so that it can be passed to the other functions of
the package. The run_model function and all other model functions run in JAGS.
## Run a random-effects network meta-analysis
res <- run_model(data = nma.baker2009,
measure = "OR",
model = "RE",
assumption = "IDE-ARM",
heter_prior = list("halfnormal", 0, 1),
mean_misspar = c(0, 0),
var_misspar = 1,
D = 0,
n_chains = 3,
n_iter = 10000,
n_burnin = 1000,
n_thin = 1)
On the console you will see two sets of progress bars: the first one
shows the progress of the simulation. Once the model completes the
iterations for all chains, a second progress bar appears on the console
that shows the progress of the model update for
n_iter = 10000 iterations until convergence:
Specifying the arguments of the run_model function carefully is essential for the subsequent analyses to yield sensible results. This requires thorough consideration of the proper effect measure (measure), the meta-analysis model (model), the structure of the missingness parameter (assumption), the prior distribution for the heterogeneity parameter (heter_prior), the missingness mechanism (mean_misspar and var_misspar), the direction of the outcome (D), and the necessary tuning parameters to run Markov chain Monte Carlo simulations using JAGS (n_chains, n_iter, n_burnin, and n_thin). Refer to the documentation of the function by typing ?run_model.
Now we can use res in the arguments of the other model
functions to conduct the subsequent analyses, and visualize the results.
The journey has just begun.
Use the mcmc_diagnostics function to check the model's
convergence.
# Display diagnostics for Bayesian methods
mcmc_diagnostics(net = res,
par = c("EM", "tau"))
Here, we illustrate a panel of bar plots on the log odds ratio for all
possible comparisons in the network (x-axis):
The function displays diagnostic plots (trace, density, and autocorrelation) for
the monitored parameters specified in the
par argument
(here, the log odds ratio for all possible pairwise comparisons,
"EM", and the common between-trial standard deviation,
"tau"). Access all monitored parameters from
run_model using res$jagsfit.
Use the league_heatmap function to create the popular
league table:
# The names of the interventions are in the order they appear in the dataset
interv_names <- c("placebo", "budesonide", "budesonide plus formoterol", "fluticasone", "fluticasone plus salmeterol", "formoterol", "salmeterol", "tiotropium")
# The heatmap league table for estimation
league_heatmap(full1 = res,
drug_names1 = interv_names)
The table is read as row versus column and shows the (posterior mean)
odds ratio and 95% credible interval for each comparison. The
interventions are sorted in decreasing order by their posterior mean
SUCRA value, shown on the main diagonal. The larger the treatment effect,
the darker the colour shade. The league_heatmap function can
also display the treatment effects from two outcomes.
The league_heatmap_pred function has the same
arguments as league_heatmap, but it displays
predictions for all possible pairwise comparisons in the network:
# The names of the interventions are in the order they appear in the dataset
interv_names <- c("placebo", "budesonide", "budesonide plus formoterol", "fluticasone", "fluticasone plus salmeterol", "formoterol", "salmeterol", "tiotropium")
# The heatmap league table for prediction
league_heatmap_pred(full1 = res,
drug_names1 = interv_names)
A compact alternative to the league table is a forest plot of comparisons against a selected comparator intervention. For illustration, we selected budesonide as the reference and ran the following code to obtain a forest plot where the estimation and prediction results co-appear:
# Abbreviated interventions (in the order they appear in the datasets)
abbr_interv_names <- c("placebo", "BUDE", "BUDE plus FORM", "FLUT", "FLUT plus SALM", "FORM", "SALM", "TIOT")
# The forest plot with budesonide as the comparator
forestplot(full = res,
compar = "BUDE",
drug_names = abbr_interv_names)
In plot A), the 95% credible intervals (black lines)
overlay the 95% prediction intervals (orange lines). In both plots, the
interventions are sorted from best to worst based on their SUCRA value
(see plot B)).
The rankosucra_plot function creates a plot for each
intervention that combines two popular hierarchy measures: the rankogram
and SUCRA plot:
# The names of the interventions are in the order they appear in the dataset
interv_names <- c("placebo", "budesonide", "budesonide plus formoterol", "fluticasone", "fluticasone plus salmeterol", "formoterol", "salmeterol", "tiotropium")
# The rankogram-SUCRA-plot 'amalgamation'
rankosucra_plot(full1 = res,
drug_names1 = interv_names)
The interventions are sorted from best to worst based on their SUCRA
value (the number on the top left corner of each plot). The
rankosucra_plot function can also display the hierarchy
results from two outcomes; type ?rankosucra_plot for more
details.
The functions run_series_meta and
series_meta_plot are designed to explore the implications
of assuming consistency and common heterogeneity, as opposed to
performing pairwise meta-analysis separately for each observed
comparison in the network.
First, use the run_series_meta function to conduct
separate pairwise meta-analyses for all observed comparisons in the
network:
# Run separate pairwise meta-analyses
meta <- run_series_meta(full = res)
Here, the function inherits the arguments n_chains,
n_iter, n_burnin, and n_thin
from the run_model function via res; however,
you can also specify these arguments anew. Just as with
run_model, the console prints the progress of the simulations
together with the number of models to run: there are 15 observed
comparisons in the network; hence, the console prints this message (in
red), followed by a progress bar for each analysis alongside the model
update until convergence:
Next, we insert meta into the function
series_meta_plot to experience the visualisation toolkit
for this analysis:
# Abbreviated interventions (in the order they appear in the datasets)
abbr_interv_names <- c("placebo", "BUDE", "BUDE plus FORM", "FLUT", "FLUT plus SALM", "FORM", "SALM", "TIOT")
# End-user-ready results for a series of pairwise meta-analyses
series_meta_plot(full = res,
meta = meta,
drug_names = abbr_interv_names)
This function returns the same results in two formats: (i) a table that is printed on the console, and (ii) a panel of forest plots.
Network meta-analysis produced more precise results than pairwise meta-analysis for all observed comparisons (plot A)). The benefits of conducting network meta-analysis for this example are also evident for the heterogeneity parameter, which was estimated at a reasonable level and with greater precision than in any pairwise meta-analysis (plot B)): blue vertical lines refer to the posterior median and 95% credible interval of the common heterogeneity parameter in network meta-analysis.
You can export the table to an xlsx file in your working
directory by adding the argument save_xls = TRUE in the
series_meta_plot function.
Note that comparing network meta-analysis with pairwise meta-analysis results should never be used to draw conclusions about the plausibility of the consistency assumption! There are specific tools designed to evaluate this assumption. In the next part of the tutorial, we will introduce you to functions that offer local and global assessment of the consistency assumption.
Do you see at least one closed loop of interventions in your network? Are these loops informed by two-arm trials only, or by a combination of two-arm and multi-arm trials? If so, you must evaluate whether network meta-analysis produces sensible results under the consistency assumption or whether there are loops with evidence of possible inconsistency. Our COPD network has several closed loops (see the network plot above).
The run_nodesplit function applies the node-splitting
approach in synergy with the R package gemtc to select the
proper nodes to split (read van Valkenhoef and
colleagues for more details on which nodes to split when
multi-arm trials are also present in the loops):
# Run the node-splitting approach
node <- run_nodesplit(full = res)
Just as with the run_series_meta function,
run_nodesplit can inherit the MCMC-related arguments from
the run_model function via res, or we can
specify these arguments anew. Likewise, the console prints the progress
of simulations and the number of node-splitting models to run. There are
8 such models (the message in red), and each one of these models will
also be updated until convergence:
Next, we insert
node into the function
nodesplit_plot to get the necessary results both in
tabular and graphical format:
# Abbreviated interventions (in the order they appear in the datasets)
abbr_interv_names <- c("placebo", "BUDE", "BUDE plus FORM", "FLUT", "FLUT plus SALM", "FORM", "SALM", "TIOT")
# End-user-ready results for the consistency evaluation locally
nodesplit_plot(full = res,
node = node,
drug_names = abbr_interv_names)
Just as with the series_meta_plot function, you can
export each table to an xlsx file in your working directory by
inserting the argument save_xls = TRUE in the
nodesplit_plot function.
The first graph is a panel of interval plots to check consistency through the inconsistency factor (the difference between direct and indirect estimate):
For each split node, we can see the estimated direct effect (from pairwise meta-analysis), the indirect effect (after removing the corresponding comparison from the network), and the inconsistency factor (IF). The deviance information criterion for each model appears on the top left of each plot, and the plots are sorted in ascending order of this criterion. The 95% credible interval for the inconsistency factor crosses the vertical line (of consistency) in all plots. One may conclude that the consistency assumption holds; however, the point estimate is far from zero in all plots. Time to put our thinking hat on.
The next graph is a ‘reverse’ forest plot on the between-trial standard deviation after each split node:
Again, the split nodes have been sorted in ascending order of the deviance information criterion. The blue horizontal lines refer to the posterior median and 95% credible interval of the common heterogeneity parameter in network meta-analysis.
The last of the methods evaluating the assumptions of network
meta-analysis is network meta-regression. We will investigate whether
the publication year may have an effect-modifying role; in other words, do
intervention effects change over time? We created the vector
cov with the publication year of the trials in the order
they appear in the dataset nma.baker2009.
Run the following code to conduct network meta-regression using the
function run_metareg:
# The year of publication
cov <- c(1996, 1998, 1999, rep(2000, 2), 2001, rep(2002, 5), rep(2003, 2), rep(2005, 4), rep(c(2006, 2007), each = 2))
# Run network meta-regression
reg <- run_metareg(full = res,
covariate = cov,
covar_assumption = "exchangeable")
As you may have guessed by now, we insert
reg into the
function metareg_plot to obtain several results in
tabular and graphical format:
# Abbreviated interventions (in the order they appear in the datasets)
abbr_interv_names <- c("placebo", "BUDE", "BUDE plus FORM", "FLUT", "FLUT plus SALM", "FORM", "SALM", "TIOT")
# End-user-ready results for network meta-regression
metareg_plot(full = res,
reg = reg,
compar = "BUDE",
cov_value = list(2002, "Publication year"),
drug_names = abbr_interv_names)
In case you were wondering why a simulation progress appears on
the R console, the posterior mean and standard deviation of the
basic parameters and corresponding regression coefficients (as estimated
from run_metareg) are inserted into the BUGS code to obtain
the posterior distribution of SUCRA at the selected value of the
covariate (cov_value).
Overall, the network meta-regression results do not seem to differ substantially from the network meta-analysis results. The posterior mean of the regression coefficient for each intervention (last table) is another hint that the intervention effects may not change over time systematically: the ratio of odds ratios between two subsequent years almost equals 1.
As a reminder, you may insert the argument
save_xls = TRUE in the metareg_plot function
to export each table to an xlsx file in your working
directory.
As with the forestplot function, the 95% credible
intervals (black lines) overlay the 95% prediction intervals (coloured
lines) in plot A). The odds ratios refer to comparisons
with budesonide, the selected comparator intervention
(compar = "BUDE"). In both plots, the interventions are
sorted from best to worst based on their SUCRA value (see plot
B)).
The hierarchy of the interventions does not materially change when
comparing the two models regarding the SUCRA values for the publication
year 2002.
The main functionalities of rnmamod have been presented
in this tutorial. The interested reader is encouraged to browse
the manual
to dive into the technical details of the functions.
For any questions about the tutorial and the rnmamod R package, feel free to contact Spineli.Loukia@mh-hannover.de. For any information about the author, visit loukiaspineli.netlify.app.